Combining word prediction and r-ary Huffman coding for text entry

Authors

  • Seung Wook Kim
  • Frank Rudzicz
Abstract

Two approaches to reducing effort in switch-based text entry for augmentative and alternative communication devices are word prediction and efficient coding schemes such as Huffman coding. However, the character distributions that inform the latter have never accounted for the use of the former. In this paper, we provide the first combination of Huffman codes and word prediction, using both trigram and long short-term memory (LSTM) language models. Results show a significant effect of the length of word prediction lists, and up to 41.46% switch-stroke savings using a trigram model.
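As a rough illustration of the coding side only, the sketch below builds an r-ary Huffman code with the textbook construction (pad with zero-weight dummy symbols, then repeatedly merge the r lightest nodes). It is not the authors' implementation and does not model word prediction; the function name `rary_huffman`, the sample frequencies, and the reading of each code digit as one of r switches are assumptions made for the example.

```python
import heapq
from itertools import count


def rary_huffman(freqs, r=3):
    """Return {symbol: code}, where each code is a string of digits 0..r-1.

    Assumes 2 <= r <= 10 so that every digit is a single character.
    """
    if not 2 <= r <= 10:
        raise ValueError("this sketch supports 2 <= r <= 10")
    tie = count()  # tiebreaker so the heap never compares node payloads
    heap = [(w, next(tie), ("leaf", s)) for s, w in freqs.items()]
    # Pad with zero-weight dummies until (n - 1) is divisible by (r - 1),
    # so that every merge, including the last, combines exactly r nodes.
    while (len(heap) - 1) % (r - 1) != 0:
        heap.append((0, next(tie), ("dummy", None)))
    heapq.heapify(heap)

    # Repeatedly merge the r lightest nodes into one internal node.
    while len(heap) > 1:
        group = [heapq.heappop(heap) for _ in range(r)]
        weight = sum(g[0] for g in group)
        children = [g[2] for g in group]
        heapq.heappush(heap, (weight, next(tie), ("node", children)))

    codes = {}

    def walk(node, prefix):
        kind, payload = node
        if kind == "leaf":
            codes[payload] = prefix or "0"  # lone-symbol edge case
        elif kind == "node":
            for digit, child in enumerate(payload):
                walk(child, prefix + str(digit))
        # dummy nodes receive no code

    walk(heap[0][2], "")
    return codes


# Hypothetical character counts, not taken from the paper.
example_freqs = {"e": 12, "t": 9, "a": 8, "o": 7, "n": 6}
example_codes = rary_huffman(example_freqs, r=3)
```

With three switches, each digit of a code would correspond to one switch press, so frequent characters cost fewer presses. Folding word prediction in would change the effective character distribution, which is the gap the abstract describes.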

Similar resources

Gestural Text Entry Using Huffman Codes

The H4 technique facilitates text entry with key sequences created using Huffman coding. This study evaluates the use of touch and motion-sensing gestures for H4 input. Touch input yielded better entry speeds (6.6 wpm, versus 5.3 wpm with motion-sensing) and more favourable participant feedback. Accuracy metrics did not differ significantly between the two conditions. Changes to the H4 techniqu...

Tight Bounds on the Redundancy of Huffman Codes

Consider a discrete finite source with N symbols, and with the probability distribution p := (u1, u2, . . . , uN). It is well-known that the Huffman encoding algorithm [1] provides an optimal prefix code for this source. A D-ary Huffman code is usually represented using a D-ary tree T, whose leaves correspond to the source symbols; the D edges emanating from each intermediate node of T are lab...

LIPT: A Reversible Lossless Text Transform to Improve Compression Performance

Lossless compression researchers have developed highly sophisticated approaches, such as Huffman encoding, arithmetic encoding, the Lempel-Ziv family, Dynamic Markov Compression (DMC), Prediction by Partial Matching (PPM), and Burrows-Wheeler Transform (BWT) based algorithms. We propose an alternative approach in this paper to develop a reversible transformation that can be applied to a source ...

Twenty (or so) Questions: D-ary Bounded-Length Huffman Coding

The game of Twenty Questions has long been used to illustrate binary source coding. Recently, a physical device has been developed that mimics the process of playing Twenty Questions, with the device supplying the questions and the user providing the answers. However, this game differs from Twenty Questions in two ways: Answers need not be only “yes” and “no,” and the device continues to ask qu...

Improving Semistatic Compression Via Pair-Based Coding

In recent years, new semistatic word-based byte-oriented compressors, such as Plain and Tagged Huffman and the Dense Codes, have been used to improve the efficiency of text retrieval systems, while reducing the compressed collections to 30–35% of their original size. In this paper, we present a new semistatic compressor, called Pair-Based End-Tagged Dense Code (PETDC). PETDC compresses Englis...

Publication date: 2016